Computer and Modernization ›› 2009, Vol. 1 ›› Issue (12): 33-35,1. doi: 10.3969/j.issn.1006-2475.2009.12.009

• Artificial Intelligence •

A New Method of Web Mining Based on Support Vector Machines and Clustering

SU Yi-ling

  1. Experimental Center, Nanhai Campus, South China Normal University, Foshan, Guangdong 528225, China
  • Received: 2009-05-18 Online: 2009-11-27 Published: 2009-11-27

Abstract: To meet the growing demand for Web data mining, this paper proposes a new Web mining method based on support vector machines (SVMs) and clustering. The method rests on the observation that support vectors do not occur in the correctly classified regions that lie outside the margin between the two classes of samples. By introducing the clustering concepts of class centroid, class radius, and inter-centroid distance, it removes non-support vectors quickly and accurately while preserving the generalization ability of the algorithm. Experiments show that the improved algorithm prunes the training samples quickly and precisely and also handles the generalization problem well.

Key words: Web mining, support vector machine, clustering

CLC Number:
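
The abstract names the ingredients of the pruning step (class centroid, class radius, inter-centroid distance) but does not state the exact rule, so the following is only a minimal sketch under assumptions and not the paper's algorithm. It assumes two classes and Euclidean distance, and it keeps a sample when its distance to the opposite class centroid is at most the inter-centroid distance plus `slack` times the radius of its own class. The function name `prune_training_set` and the parameter `slack` are hypothetical. By the triangle inequality, `slack=1.0` keeps every sample, while `slack=0.0` keeps only the samples on the side facing the other class.

```python
import math


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def prune_training_set(samples, labels, slack=0.0):
    """Return one flag per sample: True = keep as a support vector candidate.

    Hypothetical rule (an assumption, not taken from the paper): keep a
    sample when its distance to the opposite class centroid is at most
    centroid_distance + slack * radius_of_its_own_class.
    """
    classes = sorted(set(labels))
    if len(classes) != 2:
        raise ValueError("this sketch handles exactly two classes")

    # Group the samples by class, then compute centroid and radius per class.
    members = {c: [x for x, l in zip(samples, labels) if l == c] for c in classes}
    centroids = {c: centroid(members[c]) for c in classes}
    radii = {c: max(math.dist(x, centroids[c]) for x in members[c]) for c in classes}

    # Distance between the two class centroids.
    centroid_distance = math.dist(centroids[classes[0]], centroids[classes[1]])
    other = {classes[0]: classes[1], classes[1]: classes[0]}

    keep = []
    for x, label in zip(samples, labels):
        limit = centroid_distance + slack * radii[label]
        keep.append(math.dist(x, centroids[other[label]]) <= limit)
    return keep


# Toy data: two classes on a line, with centroids at x=1 and x=5.
samples = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (4.0, 0.0), (5.0, 0.0), (6.0, 0.0)]
labels = [0, 0, 0, 1, 1, 1]
keep = prune_training_set(samples, labels)
# keep == [False, True, True, True, True, False]
```

In this toy run the two outermost points, which lie farthest from the opposite class, are dropped. The retained samples would then be passed to whichever SVM trainer is in use. A larger `slack` trades pruning speed for a lower risk of discarding a true support vector.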